Explaining the algorithm with an easy problem
The Metropolis algorithm and improved versions such as the Gibbs sampler generally perform well, but they run into problems when parameters are correlated.
Take, for example, these data with high collinearity, akin to the 2-legs example:
set.seed(1)
Sigma = matrix(c(1, .95, .95, 1), nrow = 2)          # correlation of .95 between the two predictors
X = MASS::mvrnorm(100, mu = c(0, 0), Sigma = Sigma)  # 100 draws of two collinear predictors
Y = X %*% c(1, 1) + rnorm(100)                       # true coefficients are both 1
XY = cbind(X1 = X[, 1], X2 = X[, 2], Y = Y)
colnames(XY)[3] = "Y"                                # %*% returns an unnamed matrix column
par(mar = c(3, 3, 2, 1), mgp = c(1.25, .5, 0), tck = -.01)
pairs(XY)                                            # scatterplot matrix of X1, X2, Y
If we estimate the model \(\small Y \sim \text{Normal}(\beta_1 X_1 + \beta_2 X_2, \sigma)\), the coefficients \(\small \beta_1\) and \(\small \beta_2\) are highly correlated.
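We can make this correlation concrete with a quick least-squares fit (an illustrative check, not the Bayesian estimate):

```r
# Recreate the collinear data and inspect the correlation of the
# coefficient estimates via OLS
set.seed(1)
Sigma = matrix(c(1, .95, .95, 1), nrow = 2)
X = MASS::mvrnorm(100, mu = c(0, 0), Sigma = Sigma)
Y = as.vector(X %*% c(1, 1) + rnorm(100))
fit = lm(Y ~ X[, 1] + X[, 2])
round(cov2cor(vcov(fit)), 2)  # off-diagonal entry for the two slopes is close to -.95
```

The near-perfect negative correlation of the slope estimates is the mirror image of the positive correlation between the predictors.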
Let's see what the Metropolis sampler does with these data:
Metropolis sampler for the hard problem
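A minimal random-walk Metropolis sketch for the two regression coefficients could look like the following (with \(\small \sigma\) fixed at 1 and a wide normal prior, both assumptions made purely for illustration):

```r
# Random-walk Metropolis for the regression coefficients of the collinear data
set.seed(1)
Sigma = matrix(c(1, .95, .95, 1), nrow = 2)
X = MASS::mvrnorm(100, mu = c(0, 0), Sigma = Sigma)
Y = as.vector(X %*% c(1, 1) + rnorm(100))

log_post = function(b) {
  sum(dnorm(Y, mean = X %*% b, sd = 1, log = TRUE)) +  # likelihood (sigma fixed at 1)
    sum(dnorm(b, 0, 10, log = TRUE))                   # wide normal prior
}

n_iter = 5000
chain = matrix(0, n_iter, 2)
b = c(0, 0); accept = 0
for (i in 1:n_iter) {
  prop = b + rnorm(2, 0, .1)   # symmetric random-walk proposal
  if (log(runif(1)) < log_post(prop) - log_post(b)) {
    b = prop; accept = accept + 1
  }
  chain[i, ] = b
}
accept / n_iter                # acceptance rate
```

With the step size tuned to the narrow direction of the ridge, the chain can only take tiny steps and therefore explores the long direction of the correlated posterior very slowly.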
We can observe the following: the chain moves only slowly along the narrow ridge of the correlated posterior, so successive samples are highly autocorrelated and the effective number of samples is low.
This is just one example where Metropolis or Gibbs samplers can have difficulties.
One sampling algorithm that does much better than Metropolis and Gibbs is Hamiltonian Monte Carlo. On an intuitive level, the key difference in favor of Hamiltonian Monte Carlo is that it generates proposals less randomly than either Metropolis or Gibbs. Less randomly here means that the direction from the current sample to the proposal is not random: proposals are more likely to lie in the direction of the bulk of the posterior distribution.
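The idea can be sketched in a few lines. The following is a minimal Hamiltonian Monte Carlo step (leapfrog integrator) for a correlated bivariate normal target; it is an illustration of the mechanics, not a production sampler, and the step size and path length are ad hoc choices:

```r
# Minimal HMC sketch for a correlated bivariate normal target
set.seed(1)
Sigma = matrix(c(1, .95, .95, 1), nrow = 2)
Sigma_inv = solve(Sigma)
U      = function(q) 0.5 * sum(q * (Sigma_inv %*% q))  # potential energy = -log density
grad_U = function(q) as.vector(Sigma_inv %*% q)        # its gradient

hmc_step = function(q, eps = .1, L = 20) {
  p = rnorm(length(q))                        # fresh random momentum
  q_new = q
  p_new = p - eps / 2 * grad_U(q_new)         # half step for momentum
  for (l in 1:L) {
    q_new = q_new + eps * p_new               # full step for position
    if (l < L) p_new = p_new - eps * grad_U(q_new)
  }
  p_new = p_new - eps / 2 * grad_U(q_new)     # final half step for momentum
  # accept or reject based on the change in total energy
  H_old = U(q)     + sum(p^2)     / 2
  H_new = U(q_new) + sum(p_new^2) / 2
  if (log(runif(1)) < H_old - H_new) q_new else q
}

n_iter = 1000
samples = matrix(0, n_iter, 2)
q = c(-2, 2)                                  # start away from the bulk
for (i in 1:n_iter) { q = hmc_step(q); samples[i, ] = q }
```

Only the initial momentum is random; the trajectory itself follows the gradient of the log posterior, which is why the proposals travel along the ridge instead of stumbling across it.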
To clarify the relationship between the different terms:
ulam is a function in the rethinking package that translates rethinking models (alist(...)) into the Stan language and runs the sampler.
brms translates standard R model formulas (e.g. fit = brm(y ~ x1 + x2, data = dt)) into the Stan language and runs the sampler.
rstan and cmdstanr are lower-level interfaces, which require that the user writes the model directly in the Stan programming language.